Smart Paper Notes

Differentially Private Decoding in Large Language Models

Jimit Majmudar , Christophe Dupuy , Charith Peris , Sami Smaili , Rahul Gupta , Richard Zemel

Categories: Natural Language Processing | Machine Learning

2022-05-26

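Recent large-scale natural language processing (NLP) systems start from a large language model (LLM) pre-trained on massive and diverse corpora. In practice, the pre-trained model is adapted to a wide range of tasks by fine-tuning on task-specific datasets. LLMs, while effective, have been shown to memorize instances of their training data, potentially revealing private information processed during pre-training. This potential leakage may propagate further to the downstream tasks for which the LLM is fine-tuned. On the other hand, privacy-preserving algorithms usually involve retraining from scratch, which is prohibitively expensive for LLMs. In this work, we propose a simple, easy-to-interpret, and computationally lightweight perturbation mechanism that is applied to an already trained model at the decoding stage. Our perturbation mechanism is model-agnostic and can be used in conjunction with any LLM. We provide a theoretical analysis showing that the proposed mechanism is differentially private, and experimental results showing a privacy-utility trade-off.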
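
The abstract above does not spell out the perturbation mechanism. As a minimal sketch of what a decoding-stage perturbation could look like, the Python below assumes one simple choice: a linear interpolation between the model's next-token distribution and the uniform distribution over the vocabulary. The function names, the parameter `lam`, and the toy probabilities are all hypothetical and chosen only for illustration.

```python
import random
from typing import List, Sequence


def perturb_distribution(probs: Sequence[float], lam: float) -> List[float]:
    """Mix a next-token distribution with the uniform distribution.

    Assumption for illustration: lam = 1 keeps the model's distribution
    unchanged, lam = 0 yields the uniform distribution.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if not probs:
        raise ValueError("probs must not be empty")
    uniform = 1.0 / len(probs)
    return [lam * p + (1.0 - lam) * uniform for p in probs]


def sample_token(probs: Sequence[float], rng: random.Random) -> int:
    """Sample a token index from the given distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Toy three-token vocabulary; the probabilities are made up.
model_probs = [0.7, 0.2, 0.1]
private_probs = perturb_distribution(model_probs, lam=0.5)
token_id = sample_token(private_probs, random.Random(0))
```

Under this assumed mechanism, a smaller `lam` flattens the distribution more, which hides what the model memorized at the cost of output quality. That is one way to picture the privacy-utility trade-off the abstract mentions.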

Translated by Google Translate

BSA -- Bi-Stiffness Actuation for optimally exploiting intrinsic compliance and inertial coupling effects in elastic joint robots

Dennis Ossadnik , Mehmet C. Yildirim , Fan Wu , Abdalla Swikir , Hugo T. M. Kussaba , Saeed Abdolshah , Sami Haddadin

Categories: Robotics

2022-12-30

Compliance in actuation has been exploited to generate highly dynamic maneuvers such as throwing that take advantage of the potential energy stored in joint springs. However, the energy storage and release could not be well-timed yet. On the contrary, for multi-link systems, the natural system dynamics might even work against the actual goal. With the introduction of variable stiffness actuators, this problem has been partially addressed. With a suitable optimal control strategy, the approximate decoupling of the motor from the link can be achieved to maximize the energy transfer into the distal link prior to launch. However, such continuous stiffness variation is complex and typically leads to oscillatory swing-up motions instead of clear launch sequences. To circumvent this issue, we investigate decoupling for speed maximization with a dedicated novel actuator concept denoted Bi-Stiffness Actuation. With this, it is possible to fully decouple the link from the joint mechanism by a switch-and-hold clutch and simultaneously keep the elastic energy stored. We show that with this novel paradigm, it is not only possible to reach the same optimal performance as with power-equivalent variable stiffness actuation, but even directly control the energy transfer timing. This is a major step forward compared to previous optimal control approaches, which rely on optimizing the full time-series control input.

Domain-specific transfer learning in the automated scoring of tumor-stroma ratio from histopathological images of colorectal cancer

Liisa Petäinen , Juha P. Väyrynen , Pekka Ruusuvuori , Ilkka Pölönen , Sami Äyrämö , Teijo Kuopio

Categories: Computer Vision | Machine Learning

2022-12-30

Tumor-stroma ratio (TSR) is a prognostic factor for many types of solid tumors. In this study, we propose a method for automated estimation of TSR from histopathological images of colorectal cancer. The method is based on convolutional neural networks which were trained to classify colorectal cancer tissue in hematoxylin-eosin stained samples into three classes: stroma, tumor and other. The models were trained using a data set that consists of 1343 whole slide images. Three different training setups were applied with a transfer learning approach using domain-specific data, i.e., an external colorectal cancer histopathological data set. The three most accurate models were chosen as classifiers, TSR values were predicted, and the results were compared to a visual TSR estimation made by a pathologist. The results suggest that classification accuracy does not improve when domain-specific data are used in the pre-training of the convolutional neural network models in the task at hand. Classification accuracy for stroma, tumor and other reached 96.1% on an independent test set. Among the three classes, the best model gained the highest accuracy (99.3%) for class tumor. When TSR was predicted with the best model, the correlation between the predicted values and values estimated by an experienced pathologist was 0.57. Further research is needed to study associations between computationally predicted TSR values and other clinicopathological factors of colorectal cancer and the overall survival of the patients.
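
The abstract does not give the formula used to turn the three-class predictions into a TSR value. As a rough sketch, the Python below assumes a common convention: TSR is the stroma fraction of the combined tumor and stroma area, with counts of classified tiles standing in for area. The function name and the tile labels are hypothetical.

```python
from collections import Counter
from typing import Iterable


def tumor_stroma_ratio(tile_labels: Iterable[str]) -> float:
    """Estimate TSR from per-tile class predictions.

    Assumption for illustration: TSR = stroma / (stroma + tumor),
    and tiles labelled "other" are ignored.
    """
    counts = Counter(tile_labels)
    stroma, tumor = counts["stroma"], counts["tumor"]
    if stroma + tumor == 0:
        raise ValueError("no stroma or tumor tiles to compute a ratio from")
    return stroma / (stroma + tumor)


# Made-up predictions for eight tiles of one slide.
labels = ["stroma"] * 3 + ["tumor"] * 1 + ["other"] * 4
tsr = tumor_stroma_ratio(labels)
```

With these made-up labels, three of the four relevant tiles are stroma, so the estimate is 0.75. A real pipeline would also need to restrict the computation to the tumor region.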

Informed Circular Fields for Global Reactive Obstacle Avoidance of Robotic Manipulators

Marvin Becker , Philipp Caspers , Tom Hattendorf , Torsten Lilge , Sami Haddadin , Matthias A. Müller

Categories: Robotics

2022-12-12

In this paper a global reactive motion planning framework for robotic manipulators in complex dynamic environments is presented. In particular, the circular field predictions (CFP) planner from Becker et al. (2021) is extended to ensure obstacle avoidance of the whole structure of a robotic manipulator. Towards this end, a motion planning framework is developed that leverages global information about promising avoidance directions from arbitrary configuration space motion planners, resulting in improved global trajectories while reactively avoiding dynamic obstacles and decreasing the required computational power. The resulting motion planning framework is tested in multiple simulations with complex and dynamic obstacles and demonstrates great potential compared to existing motion planning approaches.

Democratizing Machine Translation with OPUS-MT

Jörg Tiedemann , Mikko Aulamo , Daria Bakshandaeva , Michele Boggia , Stig-Arne Grönroos , Tommi Nieminen , Alessandro Raganato , Yves Scherrer , Raul Vazquez , Sami Virpioja

Categories: Natural Language Processing

2022-12-04

This paper presents the OPUS ecosystem with a focus on the development of open machine translation models and tools, and their integration into end-user applications, development platforms and professional workflows. We discuss our on-going mission of increasing language coverage and translation quality, and also describe on-going work on the development of modular translation models and speed-optimized compact solutions for real-time translation on regular desktops and small devices.

Numerical evidence against advantage with quantum fidelity kernels on classical data

Lucas Slattery , Ruslan Shaydulin , Shouvanik Chakrabarti , Marco Pistoia , Sami Khairy , Stefan M. Wild

Categories: Machine Learning

2022-11-29

Quantum machine learning techniques are commonly considered one of the most promising candidates for demonstrating practical quantum advantage. In particular, quantum kernel methods have been demonstrated to be able to learn certain classically intractable functions efficiently if the kernel is well-aligned with the target function. In the more general case, quantum kernels are known to suffer from exponential "flattening" of the spectrum as the number of qubits grows, preventing generalization and necessitating the control of the inductive bias by hyperparameters. We show that the general-purpose hyperparameter tuning techniques proposed to improve the generalization of quantum kernels lead to the kernel becoming well-approximated by a classical kernel, removing the possibility of quantum advantage. We provide extensive numerical evidence for this phenomenon utilizing multiple previously studied quantum feature maps and both synthetic and real data. Our results show that unless novel techniques are developed to control the inductive bias of quantum kernels, they are unlikely to provide a quantum advantage on classical data.

ON-DEMAND-FL: A Dynamic and Efficient Multi-Criteria Federated Learning Client Deployment Scheme

Mario Chahoud , Hani Sami , Azzam Mourad , Safa Otoum , Hadi Otrok , Jamal Bentahar , Mohsen Guizani

Categories: Artificial Intelligence | Machine Learning

2022-11-05

In this paper, we increase the availability and integration of devices in the learning process to enhance the convergence of federated learning (FL) models. To address the issue of having all the data in one location, federated learning, which maintains the ability to learn over decentralized data sets, combines privacy and technology. Until the model converges, the server combines the updated weights obtained from each dataset over a number of rounds. The majority of the literature has suggested client selection techniques to accelerate convergence and boost accuracy. However, none of the existing proposals have focused on the flexibility to deploy and select clients as needed, wherever and whenever that may be. Due to the extremely dynamic surroundings, some devices are actually not available to serve as clients in FL, which affects the availability of data for learning and the applicability of the existing solutions for client selection. In this paper, we address the aforementioned limitations by introducing On-Demand-FL, a client deployment approach for FL, offering more volume and heterogeneity of data in the learning process. We make use of containerization technology such as Docker to build efficient environments using IoT and mobile devices serving as volunteers. Furthermore, Kubernetes is used for orchestration. A genetic algorithm (GA) is used to solve the multi-objective optimization problem due to its evolutionary strategy. The performed experiments using the Mobile Data Challenge (MDC) dataset and the Localfed framework illustrate the relevance of the proposed approach and the efficiency of the on-the-fly deployment of clients whenever and wherever needed with fewer discarded rounds and more available data.
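
The abstract says only that the server combines the clients' updated weights in each round. As a minimal sketch of one such combination step, the Python below assumes FedAvg-style averaging, in which each client's weights are weighted by its local sample count. The function name, the dictionary layout, and the toy values are hypothetical.

```python
from typing import Dict, List, Sequence


def aggregate_weights(
    client_weights: Sequence[Dict[str, List[float]]],
    client_sizes: Sequence[int],
) -> Dict[str, List[float]]:
    """Average client parameters, weighted by local dataset size.

    Assumption for illustration: every client reports the same
    parameter names with vectors of the same length.
    """
    if len(client_weights) != len(client_sizes) or not client_weights:
        raise ValueError("need one positive-length size list matching the clients")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    aggregated: Dict[str, List[float]] = {}
    for name, reference in client_weights[0].items():
        aggregated[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(len(reference))
        ]
    return aggregated


# Two made-up clients holding 1 and 3 samples respectively.
clients = [{"layer": [1.0, 2.0]}, {"layer": [3.0, 4.0]}]
sizes = [1, 3]
global_weights = aggregate_weights(clients, sizes)
```

In this toy round the second client holds three times as much data, so the aggregated vector `[2.5, 3.5]` sits closer to its weights. Deploying more clients on demand, as the abstract proposes, would change which entries appear in `clients` from round to round.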

Supervised Contrastive Learning as Multi-Objective Optimization for Fine-Tuning Large Pre-trained Language Models

Youness Moukafih , Mounir Ghogho , Kamel Smaili

Categories: Natural Language Processing

2022-09-28

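Recently, supervised contrastive learning (SCL) has been shown to achieve excellent performance in most classification tasks. In SCL, a neural network is trained to optimize two objectives: pulling an anchor and positive samples together in the embedding space, and pushing the anchor apart from the negatives. However, these two objectives may conflict, requiring trade-offs between them during optimization. In this work, we formulate the SCL problem as a multi-objective optimization problem for the fine-tuning phase of the RoBERTa language model. Two methods are used to solve the optimization problem: (i) the linear scalarization (LS) method, which minimizes a weighted linear combination of per-task losses; and (ii) the Exact Pareto Optimal (EPO) method, which finds the intersection of the Pareto front with a given preference vector. We evaluate our approach on several GLUE benchmark tasks without using data augmentation, memory banks, or generated adversarial examples. The empirical results show that the proposed learning strategy significantly outperforms a strong competitive contrastive learning baseline.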
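
Linear scalarization, as described above, reduces several losses to one number by taking a weighted sum. The Python sketch below shows only that reduction step on plain floats. The function name, the two example loss values, and the equal weights are hypothetical, and a real fine-tuning loop would apply the same formula to differentiable loss tensors.

```python
from typing import Sequence


def linear_scalarization(losses: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted linear combination of per-objective losses."""
    if len(losses) != len(weights):
        raise ValueError("losses and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    return sum(w * loss for w, loss in zip(weights, losses))


# Made-up values for the two SCL objectives: pull positives, push negatives.
pull_loss, push_loss = 0.8, 0.4
total_loss = linear_scalarization([pull_loss, push_loss], [0.5, 0.5])
```

The weights fix the trade-off between the objectives in advance. The EPO method in the abstract instead looks for the point on the Pareto front that matches a given preference vector.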

Application of Deep Learning in Generating Structured Radiology Reports: A Transformer-Based Technique

Seyed Ali Reza Moezzi , Abdolrahman Ghaedi , Mojdeh Rahmanian , Seyedeh Zahra Mousavi , Ashkan Sami

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-09-25

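Since the radiology reports needed for clinical practice and research are written and stored as free-text narratives, extracting relevant information for further analysis is difficult. In these circumstances, natural language processing (NLP) techniques can facilitate automatic information extraction and the transformation of free-text formats into structured data. In recent years, deep learning (DL)-based models have been adapted for NLP experiments with promising results. Despite the significant potential of DL models based on artificial neural networks (ANN) and convolutional neural networks (CNN), these models face some limitations when implemented in clinical practice. Transformers, another new DL architecture, have been increasingly applied to improve the process. Therefore, in this study, we propose a transformer-based fine-grained named entity recognition (NER) architecture for clinical information extraction. We collected 88 abdominopelvic sonography reports in free-text format and annotated them based on an information schema we developed. The text-to-text transfer transformer model (T5) and SciFive, a pre-trained domain-specific adaptation of the T5 model, were fine-tuned to extract entities and relations and to transform the input into a structured format. Our transformer-based model outperformed previously applied approaches such as ANN and CNN models, with ROUGE-1, ROUGE-2, ROUGE-L, and BLEU scores of 0.816, 0.668, 0.528, and 0.743, respectively, while providing an interpretable structured report.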

Robust Ensemble Morph Detection with Domain Generalization

Hossein Kashiani , Shoaib Meraj Sami , Sobhan Soleymani , Nasser M. Nasrabadi

Categories: Computer Vision

2022-09-16

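Although a substantial number of studies are dedicated to morph detection, most of them fail to generalize to morphed faces outside their training paradigm. Moreover, recent morph detection methods are highly vulnerable to adversarial attacks. In this paper, we aim to learn a morph detection model with high generalization to a wide range of morphing attacks and high robustness against different adversarial attacks. To this end, we develop an ensemble of convolutional neural networks (CNNs) and Transformer models to benefit from their capabilities simultaneously. To improve the robust accuracy of the ensemble model, we employ multi-perturbation adversarial training and generate adversarial examples with high transferability. Our exhaustive evaluations demonstrate that the proposed robust ensemble model generalizes to several morphing attacks and face datasets. In addition, we validate that our robust ensemble model gains better robustness against several adversarial attacks while outperforming state-of-the-art studies.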
